audit-fixup: wire juniper_data_datasets_cached gauge to cache layer#92
Merged
audit-fixup: wire juniper_data_datasets_cached gauge to cache layer#92
Conversation
The juniper_data_datasets_cached Gauge was defined in juniper_data/api/observability.py with a tested set_datasets_cached() helper, but had zero production callers -- the gauge would always read 0 in scrape output. juniper-ml#223 (post-METRICS-MON state report §15) flagged this as the only remaining test-vs-production gap. Wires the gauge into CachedDatasetStore (the canonical cache layer) via a private _emit_cached_count() helper that probes the cache backend for its current population and publishes via set_datasets_cached(). Failures are swallowed so observability never breaks the storage path -- mirrors the contextlib.suppress discipline already used everywhere else in the class. Cache mutation sites wired: - save() (write_through path) cached.py:88 - get_artifact_bytes() (read-through) cached.py:131-132 - delete() cached.py:162-163 - invalidate_cache() cached.py:216 - warm_cache() (post-batch) cached.py:248-249 update_meta() is intentionally NOT wired -- it mutates an existing entry, never changes cardinality. Tests: six new tests in test_cached_store.py cover insert / evict / invalidate / warm / read-through populate / no-write-through. Uses the _value.get() pattern + a registry-reset fixture matching test_observability.py. Local results: pytest juniper_data/tests/unit/ -x 813 passed pre-commit run --files cached.py test_cached_store.py all pass Refs: juniper-ml#223 §15 Tracker: juniper-ml notes/code-review/POST_METRICS_MON_TRACKER_2026-05-05.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
4 tasks
pcalnon
added a commit
that referenced
this pull request
May 6, 2026
…_reuse (Phase 2a) (#95) * refactor(observability): migrate to juniper-observability register_or_reuse (Phase 2a) Phase 2a of the migration plan in ``juniper-ml/notes/observability/REGISTER_OR_REUSE_HELPER_DESIGN_2026-05-05.md``. Drops the inline ``_get_or_create`` helper added in PR #87 and calls the canonical ``juniper_observability.register_or_reuse`` shipped in juniper-observability ``0.2.0``. ### What changes - ``juniper_data/api/observability.py``: ``_ensure_dataset_metrics`` now imports ``register_or_reuse`` from ``juniper_observability`` (lazy, inside the ``if _dataset_metrics is None`` branch) and uses it for all four collectors (Counter ×2 + Histogram + Gauge). The inline ``_get_or_create`` closure and the manual ``REGISTRY._names_to_collectors`` lookup are gone. - ``pyproject.toml``: ``juniper-observability>=0.1.1`` → ``juniper-observability>=0.2.0``. The new helper is the reason for the bump; existing 0.1.1 callers' behaviour is unchanged. ### Drive-by fix: storage circular import PR #92 (``audit-fixup: wire juniper_data_datasets_cached gauge to cache layer``) added ``from juniper_data.api.observability import set_datasets_cached`` at the top of ``juniper_data/storage/cached.py``. That triggers the import chain: juniper_data.storage.__init__ → cached.py → juniper_data.api.observability → juniper_data.api.__init__ → app.py → juniper_data.storage (← still initialising!) → ImportError: cannot import name 'LocalFSDatasetStore' …which broke ``pytest --collect-only`` on origin/main as of 2026-05-06: 31 collection errors covering every test that imports ``juniper_data.storage`` directly or transitively. ``--collect-only`` amplifies the breakage, but a normal full pytest run also hits the same path during fixture collection. Fix: defer the ``set_datasets_cached`` import to inside ``CachedDatasetStore._emit_cached_count`` so the cycle never fires at module-import time. The function is best-effort with a ``except Exception`` swallowing failures, so a deferred import that raises ``ImportError`` (e.g. during a ``CachedDatasetStore`` instantiation in a non-API context) gets logged at DEBUG and skipped the same way any other observability failure does. ### Verification Full juniper-data suite under JuniperData env: **950 passed** (was ``31 errors during collection`` on origin/main pre-fix). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ruff linter fix --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Wires the previously-dead
juniper_data_datasets_cachedPrometheus Gauge to theCachedDatasetStorecache layer. The metric and itsset_datasets_cached(n)helper have lived injuniper_data/api/observability.pysince R3, with helper-level test coverage, but had zero production callers -- so the gauge always read 0 in scrape output.This is the only test-vs-production gap surfaced by the post-METRICS-MON state report (juniper-ml#223 §15) -- a parallel PR is adding it to the post-METRICS-MON tracker (
juniper-ml notes/code-review/POST_METRICS_MON_TRACKER_2026-05-05.md).What changed
A new private
CachedDatasetStore._emit_cached_count()helper probes the cache backend'slist_datasets()(probe limit10_000, matchingwarm_cache()) and publishes the count viaset_datasets_cached(). Failures are caught and logged at DEBUG so the observability path can never break the storage path (mirrors thecontextlib.suppress(Exception)discipline already used throughout the class).Cache mutation sites wired:
save()(write-through path)juniper_data/storage/cached.py:88get_artifact_bytes()(read-through cache populate)juniper_data/storage/cached.py:131-132delete()juniper_data/storage/cached.py:162-163invalidate_cache()juniper_data/storage/cached.py:216warm_cache()(post-batch)juniper_data/storage/cached.py:248-249update_meta()is intentionally not wired -- it mutates an existing entry, never changes cardinality.Gauge semantics: the metric's HELP string is
"Number of datasets currently cached in storage", so this is Option A (count, not bytes).set_datasets_cached(int)lines up.Test plan
TestDatasetsCachedGaugeclass injuniper_data/tests/unit/test_cached_store.py(6 tests):test_save_emits_cache_count-- gauge tracks insert count up to 3test_delete_emits_decremented_cache_count-- gauge decrements on evicttest_invalidate_cache_emits_decremented_cache_count-- cache-only evict, primary untouchedtest_warm_cache_emits_populated_count-- batch populate emits final counttest_get_artifact_bytes_read_through_emits_cache_count-- read-through populate emitstest_save_without_write_through_does_not_emit-- baseline: cache untouched, gauge untouched_value.get()on the registered Gauge plus a registry-reset fixture matchingtest_observability.py's autouse cleanup, to guard againstDuplicated timeseries in CollectorRegistry.pytest juniper_data/tests/unit/ -x-- 813 passed (pre-existing 807 + 6 new)pre-commit run --files juniper_data/storage/cached.py juniper_data/tests/unit/test_cached_store.py-- all hooks pass (ruff lint, ruff format, mypy prod, mypy tests, bandit)Refs
notes/code-review/POST_METRICS_MON_TRACKER_2026-05-05.md(parallel PR adds this as a tracked item)Closes the only test-vs-production gap from the state report.